Efficient Web-Based Data Imputation with Graph Model
Authors
Abstract
A challenge for data imputation is the lack of knowledge. In this paper, we address this challenge by drawing on extra knowledge from the web. To achieve high-performance web-based imputation, we use dependencies, i.e. functional dependencies (FDs) and conditional functional dependencies (CFDs), to impute as many values as possible automatically, and we fill in the remaining missing values with minimal web access, whose cost is relatively high. To make full use of the dependencies, we model the dependency set on the data as a graph and, based on this graph model, perform automatic imputation and keyword generation for web-based imputation. With the generated keywords, we design two algorithms to extract values for imputation from the search results. Extensive experimental results on real-world data collections show that the proposed approach imputes missing values efficiently and effectively compared to existing approaches.
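The dependency-first idea in the abstract can be illustrated with a minimal sketch. It assumes plain FDs only (no CFDs), uses a simple fixed-point loop rather than the paper's graph model (which is not described on this page), and merely returns the cells that would still need a web lookup. The function name `impute_with_fds` and the sample table are hypothetical.

```python
def impute_with_fds(rows, fds):
    """Fill missing values (None) using functional dependencies.

    rows: list of dicts sharing the same keys; None marks a missing value.
    fds:  list of (lhs, rhs) pairs, where lhs is a tuple of attribute names
          and rhs is a single attribute name (lhs -> rhs).
    Returns the (row_index, attribute) cells that are still missing and
    would therefore need an external source such as a web search.
    """
    changed = True
    while changed:  # repeat so that one imputed value can enable another FD
        changed = False
        for lhs, rhs in fds:
            # Map each fully known lhs value to the first rhs value seen.
            lookup = {}
            for row in rows:
                if row[rhs] is not None and all(row[a] is not None for a in lhs):
                    lookup.setdefault(tuple(row[a] for a in lhs), row[rhs])
            # Copy the known rhs value into rows that agree on lhs.
            for row in rows:
                if row[rhs] is None and all(row[a] is not None for a in lhs):
                    key = tuple(row[a] for a in lhs)
                    if key in lookup:
                        row[rhs] = lookup[key]
                        changed = True
    return [(i, a) for i, row in enumerate(rows)
            for a, v in row.items() if v is None]


# Hypothetical sample data: zip -> city and city -> state.
rows = [
    {"zip": "10001", "city": "New York", "state": "NY"},
    {"zip": "10001", "city": None, "state": None},
    {"zip": "94105", "city": None, "state": None},
]
fds = [(("zip",), "city"), (("city",), "state")]
remaining = impute_with_fds(rows, fds)
```

In this sample the second row is completed from the first without any web access, while the third row has no matching tuple, so its `city` and `state` cells are returned in `remaining` as candidates for the costlier web-based step.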
Similar resources
Automatic Service Composition Based on Graph Coloring
Web services are independent software components that service providers publish on the Internet and that are invoked at users' request. However, in many cases no single service in the repository can satisfy the requester's needs. Service composition provides new components by using an interactive model to accelerate the programs. Prior to service comp...
Accuracy evaluation of different statistical and geostatistical censored data imputation approaches (Case study: Sari Gunay gold deposit)
Most geochemical datasets include missing data in varying proportions, which may cause a significant problem in geostatistical modeling or multivariate analysis of the data. Therefore, it is common to impute the missing data in most geochemical studies. In this study, three approaches called half detection (HD), multiple imputation (MI), and the cosimulation based on Markov model 2...
Several approaches to handling missing values of quantitative variables and their effect on the results of a clinical trial
Background and Objectives: A major challenge that affects longitudinal studies is the problem of missing data. Missing data may cause the loss of part of the information, which reduces the accuracy of the estimator and makes the results biased and inaccurate. Therefore, it is necessary to evaluate the missing data mechanism in longitudinal research and to consider thi...
Designing and implementing a 3D indoor navigation web application
In recent years, the need has arisen for indoor navigation systems that guide people during natural hazards and fires, because human settlements have become increasingly complex. This research paper aims to design and implement a visual indoor navigation web application. The designed system processes the CityGML data model automatically and then extracts semantic, topologic and geometric...